A Modified K-Means Clustering with a Density-Sensitive Distance Metric

نویسندگان

  • Ling Wang
  • Liefeng Bo
  • Licheng Jiao
چکیده

The K-Means clustering is by far the most widely used method for discovering clusters in data. It has a good performance on the data with compact super-sphere distributions, but tends to fail in the data organized in more complex and unknown shapes. In this paper, we analyze in detail the characteristic property of data clustering and propose a novel dissimilarity measure, named density-sensitive distance metric, which can describe the distribution characteristic of data clustering. By using this dissimilarity measure, a density-sensitive K-Means clustering algorithm is given, which has the ability to identify complex non-convex clusters compared with the original K-Means algorithm. The experimental results on both artificial data sets and real-world problems assess the validity of the algorithm.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Density-Sensitive Evolutionary Clustering

In this study, we propose a novel evolutionary algorithm-based clustering method, named density-sensitive evolutionary clustering (DSEC). In DSEC, each individual is a sequence of real integer numbers representing the cluster representatives, and each data item is assigned to a cluster representative according to a novel density-sensitive dissimilarity measure which can measure the geodesic dis...

متن کامل

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...

متن کامل

A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS

Data clustering is the process of partitioning a set of data objects into meaning clusters or groups. Due to the vast usage of clustering algorithms in many fields, a lot of research is still going on to find the best and efficient clustering algorithm. K-means is simple and easy to implement, but it suffers from initialization of cluster center and hence trapped in local optimum. In this paper...

متن کامل

یادگیری نیمه نظارتی کرنل مرکب با استفاده از تکنیک‌های یادگیری معیار فاصله

Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...

متن کامل

Distance Based Hybrid Approach for Cluster Analysis Using Variants of K-means and Evolutionary Algorithm

Clustering is a process of grouping same objects into a specified number of clusters. K-means and Kmedoids algorithms are the most popular partitional clustering techniques for large data sets. However, they are sensitive to random selection of initial centroids and are fall into local optimal solution. K-means++ algorithm has good convergence rate than other algorithms. Distance metric is used...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006